Documents authored by Ferreira, João


Document
Assessing Factoid Question-Answer Generation for Portuguese (Short Paper)

Authors: João Ferreira, Ricardo Rodrigues, and Hugo Gonçalo Oliveira

Published in: OASIcs, Volume 83, 9th Symposium on Languages, Applications and Technologies (SLATE 2020)


Abstract
We present work on the automatic generation of question-answer pairs in Portuguese, useful, for instance, for populating the knowledge base of question-answering systems. This includes: (i) a new corpus of close to 600 factoid sentences, manually created from an existing corpus of questions and answers, used as our benchmark; (ii) two approaches for the automatic generation of question-answer pairs, which can be seen as baselines; (iii) results of those approaches on the corpus.

Cite as

João Ferreira, Ricardo Rodrigues, and Hugo Gonçalo Oliveira. Assessing Factoid Question-Answer Generation for Portuguese (Short Paper). In 9th Symposium on Languages, Applications and Technologies (SLATE 2020). Open Access Series in Informatics (OASIcs), Volume 83, pp. 16:1-16:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


@InProceedings{ferreira_et_al:OASIcs.SLATE.2020.16,
  author =	{Ferreira, Jo\~{a}o and Rodrigues, Ricardo and Gon\c{c}alo Oliveira, Hugo},
  title =	{{Assessing Factoid Question-Answer Generation for Portuguese}},
  booktitle =	{9th Symposium on Languages, Applications and Technologies (SLATE 2020)},
  pages =	{16:1--16:9},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-165-8},
  ISSN =	{2190-6807},
  year =	{2020},
  volume =	{83},
  editor =	{Sim\~{o}es, Alberto and Henriques, Pedro Rangel and Queir\'{o}s, Ricardo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2020.16},
  URN =		{urn:nbn:de:0030-drops-130298},
  doi =		{10.4230/OASIcs.SLATE.2020.16},
  annote =	{Keywords: Question-Answer Generation, Corpus, NLP, Portuguese}
}

Document
Improving NLTK for Processing Portuguese

Authors: João Ferreira, Hugo Gonçalo Oliveira, and Ricardo Rodrigues

Published in: OASIcs, Volume 74, 8th Symposium on Languages, Applications and Technologies (SLATE 2019)


Abstract
Python has a growing community of users, especially in the AI and ML fields. Yet, Computational Processing of Portuguese in this programming language is limited, in both available tools and results. This paper describes NLPyPort, an NLP pipeline in Python, primarily based on NLTK, and focused on Portuguese. It is mostly assembled from pre-existing resources or their adaptations, but improves over the performance of existing alternatives in Python, namely in the tasks of tokenization, PoS tagging, lemmatization and NER.

Cite as

João Ferreira, Hugo Gonçalo Oliveira, and Ricardo Rodrigues. Improving NLTK for Processing Portuguese. In 8th Symposium on Languages, Applications and Technologies (SLATE 2019). Open Access Series in Informatics (OASIcs), Volume 74, pp. 18:1-18:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


@InProceedings{ferreira_et_al:OASIcs.SLATE.2019.18,
  author =	{Ferreira, Jo\~{a}o and Gon\c{c}alo Oliveira, Hugo and Rodrigues, Ricardo},
  title =	{{Improving NLTK for Processing Portuguese}},
  booktitle =	{8th Symposium on Languages, Applications and Technologies (SLATE 2019)},
  pages =	{18:1--18:9},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-114-6},
  ISSN =	{2190-6807},
  year =	{2019},
  volume =	{74},
  editor =	{Rodrigues, Ricardo and Janou\v{s}ek, Jan and Ferreira, Lu{\'\i}s and Coheur, Lu{\'\i}sa and Batista, Fernando and Gon\c{c}alo Oliveira, Hugo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2019.18},
  URN =		{urn:nbn:de:0030-drops-108852},
  doi =		{10.4230/OASIcs.SLATE.2019.18},
  annote =	{Keywords: NLP, Tokenization, PoS tagging, Lemmatization, Named Entity Recognition}
}